The Generation and Use of Regression Class Trees for MLLR Adaptation
Abstract
Maximum likelihood linear regression (MLLR) is an adaptation technique suitable for both speaker and environmental model-based adaptation. The models are adapted using a set of linear transformations, estimated in a maximum likelihood fashion from the available adaptation data. As these transformations can capture general relationships between the original model set and the current speaker, or new acoustic environment, they can be effective in adapting all the HMM distributions with limited adaptation data. Two important decisions that must be made are (i) how to cluster components together, such that they all have a similar transformation matrix, and (ii) how many transformation matrices to generate for a given block of adaptation data. This paper addresses both problems. Firstly, it describes two clustering techniques that are optimal in the sense of maximising the likelihood of the adaptation data. The first assigns each component to one of the regression classes. This may be used to generate standard regression class trees. The second scheme performs a fuzzy assignment of base class to regression class, so the transformation associated with each component is a linear combination of a set of transformations. Secondly, two schemes are examined which address the problem of how to determine the number of regression classes, and hence transforms, for a given amount of adaptation data. A cross-validation scheme based on the auxiliary function of the adaptation data is described. Another scheme based on the use of iterative MLLR is also detailed. Neither scheme requires a priori thresholding information. An initial evaluation of the techniques was performed using the ARPA 1994 test data. On this task, though "good" trees, in terms of the likelihood of the adaptation training data, were generated, neither of the optimal clustering schemes yielded gains in recognition performance. The performance of the cross-validation scheme was found to be comparable to an empirically determined threshold scheme. The best performance was achieved using iterative MLLR, which outperformed both fixed classes and threshold-based schemes.
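For orientation, the threshold-based baseline that the abstract compares against can be sketched as follows. This is only an illustrative sketch of the conventional occupancy-threshold rule for a regression class tree, not either of the schemes proposed in the paper. The `Node` class, the function names, the class labels, and the frame counts are all hypothetical. Each base class (leaf) is given the transform of the lowest node above it that has seen at least `threshold` frames of adaptation data.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """One node of a hypothetical regression class tree."""
    name: str
    occupancy: float = 0.0  # adaptation frames; set by hand on leaves only
    children: List["Node"] = field(default_factory=list)


def accumulate(node: Node) -> float:
    """Set every internal node's occupancy to the sum over its leaves."""
    if node.children:
        node.occupancy = sum(accumulate(child) for child in node.children)
    return node.occupancy


def assign_transforms(root: Node, threshold: float) -> Dict[str, Optional[str]]:
    """Map each leaf to the lowest node whose occupancy reaches the threshold.

    A value of None means that even the root lacks enough data,
    so no transform would be estimated for that leaf.
    """
    accumulate(root)
    assignment: Dict[str, Optional[str]] = {}

    def descend(node: Node, nearest: Optional[str]) -> None:
        # Occupancy never grows going down the tree, so the most recent
        # ancestor that passes the test is also the lowest one.
        if node.occupancy >= threshold:
            nearest = node.name
        if not node.children:
            assignment[node.name] = nearest
        for child in node.children:
            descend(child, nearest)

    descend(root, None)
    return assignment


# Hypothetical tree with made-up frame counts on the four base classes.
tree = Node("root", children=[
    Node("vowels", children=[Node("a", 700), Node("e", 100)]),
    Node("consonants", children=[Node("s", 150), Node("t", 120)]),
])

classes = assign_transforms(tree, threshold=250)
print(classes)
print("transforms to estimate:", len(set(classes.values())))
```

With such a rule, the number of transforms follows directly from the chosen threshold, which is the a priori setting that the cross-validation and iterative MLLR schemes described in the abstract are meant to avoid.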
Similar resources
A novel target-driven MLLR adaptation algorithm with multi-layer structure
This paper presents a novel target-driven MLLR adaptation algorithm with a multi-layer structure, which is based on a thorough analysis of MLLR using the generation of regression class trees. The new algorithm is constructed on the target-driven principle. It generates the regression classes dynamically, based on the outcome of the previous MLLR transformation. The regression classes are defined ...
Speaker clustered regression-class trees for MLLR adaptation
A speaker clustering algorithm is presented that is based on an eigenspace representation of Maximum Likelihood Linear Regression (MLLR) transformations and is used for training cluster-dependent regression-class trees for MLLR adaptation. It is shown that significant automatic speech recognition (ASR) system performance gains are possible by choosing the best regression-class tree structure fo...
Improving robustness of MLLR adaptation with speaker-clustered regression class trees
We introduce a strategy for modeling speaker variability in speaker adaptation based on maximum likelihood linear regression (MLLR). The approach uses a speaker clustering procedure that models speaker variability by partitioning a large corpus of speakers in the eigenspace of their MLLR transformations and learning cluster-specific regression class tree structures. We present experiments showin...
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
Publication date: 1996